Using Sample Size to Limit Exposure to Data Mining

Author

  • Chris Clifton
Abstract

Data mining introduces new problems in database security. The basic problem of using non-sensitive data to infer sensitive data is made more difficult by the “probabilistic” inferences possible with data mining. This paper shows how lower bounds from pattern recognition theory can be used to determine sample sizes where data mining tools cannot obtain reliable results.
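The abstract does not say which lower bound the paper uses. As a hedged illustration of the general idea, the sketch below computes one commonly cited PAC-learning lower bound stated in terms of VC dimension (Ehrenfeucht, Haussler, Kearns and Valiant). The choice of this bound, the function name `pac_sample_lower_bound`, and its parameters are assumptions for illustration and are not taken from the paper. Under this bound, a sample smaller than the returned value cannot guarantee, for every data distribution, a classifier with error at most `epsilon` with probability at least `1 - delta`.

```python
import math


def pac_sample_lower_bound(vc_dim: int, epsilon: float, delta: float) -> int:
    """Illustrative worst-case lower bound on the sample size for PAC learning.

    Assumed form (not taken from the paper):
        m >= max((vc_dim - 1) / (32 * epsilon),
                 (1 - epsilon) / epsilon * ln(1 / delta))
    which is stated for vc_dim >= 2, 0 < epsilon <= 1/8, 0 < delta <= 1/100.
    """
    if vc_dim < 2:
        raise ValueError("vc_dim must be at least 2")
    if not 0 < epsilon <= 1 / 8:
        raise ValueError("epsilon must be in (0, 1/8]")
    if not 0 < delta <= 1 / 100:
        raise ValueError("delta must be in (0, 1/100]")

    # Term driven by the richness (VC dimension) of the hypothesis class.
    vc_term = (vc_dim - 1) / (32 * epsilon)
    # Term driven by the required confidence level.
    confidence_term = (1 - epsilon) / epsilon * math.log(1 / delta)
    return math.ceil(max(vc_term, confidence_term))


if __name__ == "__main__":
    # Hypothetical setting: a classifier family with VC dimension 1000,
    # target error 10%, failure probability 1%.
    print(pac_sample_lower_bound(1000, 0.1, 0.01))
```

Read from the data owner's side, a figure like this suggests how small a released sample would need to be before worst-case guarantees for the miner break down. It is a worst-case statement, so a particular data set may still be learnable from fewer records.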


Similar resources

Sample size determination for logistic regression

The problem of sample size estimation is important in medical applications, especially in cases of expensive measurements of immune biomarkers. This paper describes the problem of logistic regression analysis with the sample size determination algorithms, namely the methods of univariate statistics, logistic regression, cross-validation and Bayesian inference. The authors, treating the regr...


The Effect of Estimation Error on Risk-adjusted Bernoulli GEWMA Control Chart in Multistage Healthcare Processes

Background and objectives: The risk-adjusted Bernoulli control chart is one of the main tools for monitoring multistage healthcare processes to achieve higher performance and effectiveness in healthcare settings. Using parameter estimates can significantly deteriorate chart performance. However, so far, the effect of estimation error on this chart in which healthcare ...


A Methodology to Estimate Ores Work Index Values, Using Miduk Copper Mine Sample

Reducing the cost of comminution is a constant goal in mineral processing plants. One of the difficulties in the size reduction section is that it is not designed properly. The key factor in designing size reduction units, such as crushers and grinding mills, is the ore's work index. The work index, wi, represents the ore grindability and is used in the Bond formula to calculate the required energy. Bond has defi...


Presented a method for estimating the cost of software using PCA to reduce the size and with the help of data mining

These days, data mining is one of the most significant issues. As a field, data mining is a mixture of computer science and statistics, which is considerably limited due to the increase in digital data and the growth of the computational power of computers. One of the domains of data mining is the software cost estimation category. In this article, classifying techniques of learning algorithm of machine ...


Open pit limit optimization using Dijkstra’s algorithm

In open-pit mine planning, the design of the most profitable ultimate pit limit is a prerequisite to developing a feasible mining sequence. Currently, the design of an ultimate pit is achieved through a computer program in most mining companies. The extraction of minerals in open mining methods needs a lot of capital investment, which may take several decades. Before the extraction, the p...



Journal title:
  • Journal of Computer Security

Volume 8  Issue 

Pages  -

Publication date 2000